Abstract:
The volume of available information is growing rapidly, and alongside valuable information, the volume of irrelevant or meaningless information is growing as well. Big data and broader digital technologies are considered primary components of smart city governance and planning, and big data analysis is regarded as defining a new era in urban planning, research, and policy. Effective data mining and pattern detection techniques are therefore increasingly important. Processing such large amounts of data calls for data mining, a technique that clarifies the associations among valid information and excludes irrelevant data in order to build a practical decision tree. However, a large amount of data increases processing time and I/O costs during mining. This study proposes distributing the data among multiple clients and dividing the computation equally among them to reduce the resource cost of mining. The main server then consolidates the computation results and generates the overall results. Experimental results indicate that the proposed algorithm is superior, allowing a larger amount of data to be processed while producing high-quality results.
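The split, compute, and consolidate flow described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: worker threads stand in for the clients, item counting stands in for the mining computation, and the function names (`split_evenly`, `local_count`, `consolidate`, `distributed_count`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_evenly(records, n_clients):
    """Split records into n_clients chunks whose sizes differ by at most one."""
    base, extra = divmod(len(records), n_clients)
    chunks, start = [], 0
    for i in range(n_clients):
        size = base + (1 if i < extra else 0)
        chunks.append(records[start:start + size])
        start += size
    return chunks


def local_count(chunk):
    """Work done on one client: count in how many records each item occurs."""
    counts = Counter()
    for record in chunk:
        counts.update(set(record))
    return counts


def consolidate(partials):
    """Work done on the main server: merge the per-client counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_count(records, n_clients):
    """Assumed stand-in for the clients: one worker thread per chunk."""
    chunks = split_evenly(records, n_clients)
    with ThreadPoolExecutor(max_workers=n_clients) as pool:
        partials = list(pool.map(local_count, chunks))
    return consolidate(partials)


records = [["a", "b"], ["a"], ["b", "c"], ["a", "c"], ["a"]]
print(distributed_count(records, n_clients=2))
```

Because the per-client counts are simply added together, the consolidated result does not depend on how the records were partitioned, which is what makes an equal split safe for this kind of computation.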
Abstract:
Energy saving is a critical issue in many sensor-network-based applications. Among these applications, surveillance has attracted extensive attention, and object tracking in sensor networks (OTSNs) is a typical surveillance application. Previous studies on energy saving for OTSNs fall into two main approaches: (1) improvements in hardware design to lower the energy consumption of attached components and (2) improvements in software to predict the movement of objects. In this paper, we propose a novel scheme, the hybrid tracking scheme (HTS), for energy-efficient object tracking. The scheme consists of two parts: (1) adaptive schedule monitoring and (2) a recovery mechanism that integrates seamless temporal movement patterns with seeding-based flooding to relocate missing objects while saving energy. We also propose a frequently visited periods mining algorithm, which efficiently discovers the frequently visited periods used for adaptive schedule monitoring from the visitation information of sensor nodes. To decrease the number of sensor nodes activated during flooding, this work is the first to propose a seeding-based flooding mechanism. Empirical evaluations under various simulation conditions and on real datasets show that the proposed HTS delivers excellent performance in terms of energy efficiency and low missing rates.
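The idea of deriving frequently visited periods from a sensor node's visitation information can be illustrated with a small sketch. The abstract does not specify the mining algorithm, so everything here is an assumption: visits are reduced to an hour of the day, an hour is "frequent" when it has at least `min_visits` visits, and adjacent frequent hours are merged into one period. The function name `frequent_periods` is hypothetical.

```python
from collections import Counter


def frequent_periods(visit_hours, min_visits):
    """Return inclusive (start_hour, end_hour) ranges of frequently visited hours.

    Assumption: an hour is frequent when it has at least min_visits visits.
    """
    counts = Counter(visit_hours)
    frequent = sorted(h for h, c in counts.items() if c >= min_visits)
    periods = []
    for hour in frequent:
        if periods and hour == periods[-1][1] + 1:
            # Extend the current period by one adjacent hour.
            periods[-1] = (periods[-1][0], hour)
        else:
            periods.append((hour, hour))
    return periods


# Hours at which objects were detected by one (hypothetical) sensor node.
visits = [8, 8, 9, 9, 9, 10, 14, 14, 20]
print(frequent_periods(visits, min_visits=2))  # [(8, 9), (14, 14)]
```

A node could then be scheduled to stay active only during the returned periods, which is one plausible reading of how such periods would feed adaptive schedule monitoring.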
Abstract:
Recommendation systems for mobile users have attracted considerable attention in recent years. However, most existing recommendation techniques are based only on the geographic features of mobile users' trajectories. In this paper, we propose a novel approach for recommending items to mobile users based on both the geographic and the semantic features of their trajectories. The core of our recommendation system is a novel cluster-based location prediction strategy, named TrajUtiRec, which is used to improve the item recommendation model. The strategy predicts the next location of a mobile user from the frequent behaviors of similar users in the same cluster, where clusters are determined by analyzing users' common behaviors in semantic trajectories. For each location, a high utility itemset mining algorithm is run to discover high utility itemsets. Accordingly, we can recommend the high utility itemsets related to the location the user is likely to visit. A comprehensive experimental evaluation shows that our proposal delivers excellent performance.
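For readers unfamiliar with high utility itemset mining, the following sketch shows the standard notion of utility on a toy dataset: the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, counted only when the transaction contains every item. The enumeration is deliberately brute force and is not the algorithm used in the paper; the data and function names are hypothetical.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for tx in transactions:  # tx maps item -> purchased quantity
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration; only suitable for tiny illustrative inputs."""
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


# Hypothetical purchases observed at one location.
transactions = [
    {"tea": 2, "cake": 1},
    {"tea": 1},
    {"cake": 2, "coffee": 1},
]
unit_profit = {"tea": 3, "cake": 5, "coffee": 4}
print(high_utility_itemsets(transactions, unit_profit, min_utility=11))
```

Unlike frequency, utility is not monotone: a superset can have higher utility than its subsets, which is why practical algorithms need pruning bounds rather than the plain Apriori property.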
Abstract:
In recent years, applications that integrate multimedia devices with wireless sensor networks have driven the evolution of wireless sensor networks into wireless multimedia sensor networks (WMSNs). Applications in WMSNs have to address both energy saving and application-level quality of service (QoS). Because of the characteristics of WMSNs, such as resource constraints and variable channel capacity, achieving application-level QoS efficiently is a challenging task. To overcome this challenge, we propose a new kind of pattern, the temporal region requesting pattern (TRRP), and a novel algorithm, TRRP-Mine, for mining TRRPs efficiently. We also design a temporal region requesting cost function for cache replacement, abbreviated as TRRC, for cooperative caching of multimedia content in WMSNs. Empirical evaluations under various simulation conditions show that the proposed method delivers excellent performance in terms of hit rate and the number of replacements.
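Cache replacement driven by a cost function can be sketched generically as below. The abstract does not define TRRC, so the cost function here is a placeholder: a hypothetical table of expected request rates per content item and time slot. The class name `CostBasedCache` and the sample data are assumptions made for illustration; the sketch also assumes a capacity of at least one.

```python
class CostBasedCache:
    """Cache that evicts the entry with the lowest cost when full.

    cost_fn(key, now) is supplied by the caller; it stands in for a
    cost function such as TRRC, whose definition is not given here.
    """

    def __init__(self, capacity, cost_fn):
        self.capacity = capacity  # assumed to be >= 1
        self.cost_fn = cost_fn
        self.items = {}
        self.hits = 0
        self.replacements = 0

    def request(self, key, value, now):
        if key in self.items:
            self.hits += 1
            return self.items[key]
        if len(self.items) >= self.capacity:
            # Evict the cached entry that is cheapest to lose right now.
            victim = min(self.items, key=lambda k: self.cost_fn(k, now))
            del self.items[victim]
            self.replacements += 1
        self.items[key] = value
        return value


# Hypothetical expected request rates per (content, time slot).
expected = {("clip_a", "morning"): 0.9, ("clip_b", "morning"): 0.1}
cache = CostBasedCache(2, lambda k, now: expected.get((k, now), 0.0))
for name in ["clip_a", "clip_b", "clip_c", "clip_a"]:
    cache.request(name, f"<data for {name}>", now="morning")
print(cache.hits, cache.replacements)  # 1 1
```

The two counters correspond to the metrics named in the abstract, hit rate and number of replacements, so different cost functions can be compared on the same request trace.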
Abstract:
The development of wireless and web technologies allows mobile users to request various kinds of services from mobile devices anytime and anywhere. Helping users obtain the information they need effectively is an important issue in mobile web systems, and discovering user behavior can greatly benefit system performance and quality of service. The behavior patterns of mobile users, in which location and service are inherently coupled, are more complex than those in traditional web systems. In this paper, we propose a novel data mining method, SMAP-Mine, that efficiently discovers mobile users' sequential movement patterns associated with requested services. We also propose corresponding prediction strategies. Empirical evaluation under various simulation conditions shows that SMAP-Mine delivers excellent performance in terms of accuracy, execution efficiency, and scalability. The proposed prediction strategies are likewise shown to be effective in terms of precision, hit ratio, and applicability.
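A much simplified view of movement patterns coupled with requested services is shown below: each session is a sequence of (location, service) pairs, frequent one-step transitions are kept, and the most frequent transition is used as the prediction. This is an illustrative sketch only and does not reproduce SMAP-Mine or its prediction strategies; the function names and data are hypothetical.

```python
from collections import Counter


def mine_transitions(sessions, min_support):
    """Count consecutive (location, service) pairs and keep the frequent ones."""
    counts = Counter()
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[(current, following)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}


def predict_next(patterns, current):
    """Return the most frequent successor of current, or None if unknown."""
    candidates = {nxt: c for (cur, nxt), c in patterns.items() if cur == current}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


sessions = [
    [("A", "news"), ("B", "map"), ("C", "mail")],
    [("A", "news"), ("B", "map")],
    [("A", "news"), ("C", "mail")],
]
patterns = mine_transitions(sessions, min_support=2)
print(predict_next(patterns, ("A", "news")))  # ('B', 'map')
```

Treating the location and the service as one combined symbol is the simplest way to keep the two coupled; longer patterns would need a sequential pattern miner rather than pair counting.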
Abstract:
The goal of data mining is to discover hidden, useful information in large databases. Mining frequent patterns from transaction databases is an important problem in data mining. As the database size increases, the computation time and required memory also increase, and as the number of items increases, user behaviours become more complex. To cope with this growing complexity, many researchers have applied parallel and distributed computing techniques to the discovery of frequent patterns from large amounts of data. However, most studies have focused on improving performance for a single task and have neglected the many-task computing issue, which is important in current cloud-computing environments. In these environments, an application is often provided as a service, e.g., the Google search engine, so many users can use it simultaneously. In this paper, we propose a set of algorithms for frequent pattern mining, comprising the Equal Working Set (EWS), Request On Demand (ROD), Small Size Working Set (SSWS), and Progressive Size Working Set (PSWS) algorithms, that provides a fast and scalable mining service in many-task computing environments. Empirical evaluations under various simulation conditions show that the proposed algorithms deliver excellent performance with respect to scalability and execution time.
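The underlying task, mining frequent patterns from a transaction database, can be illustrated with a small single-machine sketch. It enumerates itemsets up to a fixed size and keeps those whose support count meets a threshold. It does not model the EWS, ROD, SSWS, or PSWS algorithms, whose details are not given in the abstract; the function name and the `max_size` parameter are assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items and keep the frequent ones."""
    counts = Counter()
    for tx in transactions:
        items = sorted(set(tx))  # canonical order so tuples are comparable
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


transactions = [
    ["milk", "bread"],
    ["milk", "bread", "eggs"],
    ["bread", "eggs"],
    ["milk"],
]
print(frequent_itemsets(transactions, min_support=2))
```

The number of candidate itemsets grows combinatorially with the number of items per transaction, which is the cost that parallel and distributed approaches aim to spread across workers.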
Abstract:
Over the past decades, parallel and distributed computing techniques have attracted extensive attention for their ability to manage and process large amounts of data. The difficulty of mining large databases has motivated research on parallel and distributed algorithms to solve this problem. In this paper, we propose a novel data mining algorithm, Cloud-based Association Rule Mining (CARM), which efficiently utilises the available nodes to discover frequent patterns in cloud computing environments while preserving data privacy. Empirical evaluations under various simulation conditions show that the proposed CARM delivers excellent performance in terms of scalability and execution time.
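As background on how association rules follow from frequent patterns, the sketch below derives rules from a table of support counts using the standard definition confidence(A -> B) = support(A and B together) / support(A). It is a generic single-machine illustration and says nothing about CARM's distribution or privacy mechanisms; the function name is hypothetical, and itemsets are assumed to be stored as sorted tuples.

```python
from itertools import combinations


def association_rules(support, min_confidence):
    """Derive (antecedent, consequent, confidence) rules from support counts.

    Assumption: keys of support are tuples of items in sorted order.
    """
    rules = []
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                if antecedent not in support:
                    continue  # cannot compute confidence without its support
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    rules.append((antecedent, consequent, confidence))
    return rules


support = {("bread",): 3, ("milk",): 3, ("bread", "milk"): 2}
for antecedent, consequent, confidence in association_rules(support, 0.6):
    print(antecedent, "->", consequent, round(confidence, 2))
```

Rule generation only needs the support counts, so in a distributed setting it can run on the consolidated counts without further access to the raw transactions.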
Abstract:
Advances in electronic technology enable us to collect logs from various devices. Such logs require detailed analysis in order to be broadly useful, and data mining has been widely used to extract hidden information from such data. Data mining mainly comprises association rule mining, sequential pattern mining, classification, and clustering. Association rule mining has attracted significant attention and has been successfully applied in various fields. Although past studies can effectively discover frequent patterns from which association rules are deduced, execution efficiency remains a critical problem. To speed up execution, many methods using parallel and distributed computing have been proposed in recent years. Most past studies focused on parallelizing the workload on a high-end machine or in distributed computing environments such as grid or cloud computing systems; however, very few discuss how to efficiently determine the appropriate number of computing nodes with execution efficiency and load balancing in mind. An intuitive assumption is that execution speed is proportional to the number of computing nodes, that is, the more computing nodes, the faster the execution. However, this assumption is incorrect for such algorithms because of their inherent algorithmic design. Allocating too many computing nodes can lead to long execution times, and beyond this inefficiency, inappropriate resource allocation wastes computing power and network bandwidth. Conversely, load cannot be distributed effectively if too few nodes are allocated. In this paper, we propose a fast, load-balancing, and resource-efficient algorithm named FLR-Mining for discovering frequent patterns in distributed computing systems. FLR-Mining determines the appropriate number of computing nodes automatically and achieves better load balancing than existing methods. Empirical evaluation shows that FLR-Mining delivers excellent performance in terms of execution efficiency and load balancing.
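Why adding nodes does not always speed up execution can be seen with a toy cost model. The model below is an assumption made purely for illustration and is not taken from FLR-Mining: the divisible work shrinks as `total_work / n_nodes`, while coordination cost grows as `per_node_overhead * n_nodes`. Both function names and all numbers are hypothetical.

```python
def estimated_time(n_nodes, total_work, per_node_overhead):
    """Assumed model: parallel work plus coordination cost linear in node count."""
    return total_work / n_nodes + per_node_overhead * n_nodes


def best_node_count(total_work, per_node_overhead, max_nodes):
    """Pick the node count with the lowest estimated time under the toy model."""
    return min(
        range(1, max_nodes + 1),
        key=lambda n: estimated_time(n, total_work, per_node_overhead),
    )


best = best_node_count(total_work=100, per_node_overhead=1, max_nodes=50)
print(best, estimated_time(best, 100, 1))  # 10 20.0
print(estimated_time(50, 100, 1))          # 52.0
```

Under this model the estimated time falls until about the square root of `total_work / per_node_overhead` nodes and rises afterwards, which matches the abstract's point that both too few and too many nodes are costly.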
Abstract:
Advances in nanometer technology and integrated circuit technology have enabled graphics cards to carry their own memory and one or more processing units, known as GPUs, on which most graphics instructions can be processed in parallel. This computing resource can be used to improve the execution efficiency not only of graphics applications but also of other time-consuming applications such as data mining. The Clustering Affinity Search Technique is a well-known clustering algorithm that is widely used for clustering biological data. In this paper, we propose an algorithm that uses the GPU and the dedicated memory of the graphics card to accelerate its execution. The experimental results show that the proposed algorithm delivers excellent performance in terms of execution time and scales to very large databases.